2 research outputs found

    DeSyRe: on-Demand System Reliability

    No full text
    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design, in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect and fault-free system, would impact heavily, even prohibitively, the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault-tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints

    Single event upset mitigation techniques in reconfigurable hardware

    No full text
    Advances in semiconductor technology using smaller sizes of transistors in order to fit more of them in the same area and increase performance, pose a threat for the reliability of integrated circuits. Technology scaling accelerates transistor ageing and degradation, causing more faults during the lifetime of an integrated circuit. Sources of faults such as manufacturing defects, degradation and ageing of transistors degrade the performance of integrated circuits leading to faults with a permanent effect that might be catastrophic for certain applications. A special case of integrated circuits, FPGAs, suffer from radiation-induced faults since they contain million of bits for the configuration of their resources that if flipped due to radiation might change the intended functionality of the application running on the FPGA, causing a failure. However, FPGAs can be dynamically reconfigured in the field and mitigate radiation effects providing fault-tolerance and high availability. A novel fault-tolerant architecture for an artificial pancreas application is proposed that consists of a mixed substrate of ASIC and FPGA. Fault detection is provided through modular redundancy, and dynamic reconfiguration is used as a repair mechanism. Experimental results show that 5,100x lower probability of failures per hour (PFH) than a DMR for permanent faults can be achieved with 2.4x more area than DMR. In addition, the proposed solution achieves 83x lower PFH than a TMR with 1.6x area overheads when considering transient faults. A framework supporting fault injection at the configuration memory of an SRAM FPGA and scrubbing was developed throughout this work. The framework supports various SEU and scrub rates and is implemented on the modern ZYNQ FPGA architecture. Existing scrubbing strategies were implemented for a second-order polynomial case study together with two new scrubbing techniques taking into consideration area information of the modules of the application. Experimental results show that the area-driven scrubbing technique achieves 43.6% LUTs and 40.9% REGs savings when compared to a DMR design. The area-driven technique for the partial TMR design saves 15% LUTs and 23% REGs area as compared to the TMR without sacrificing availability, but with increased power consumption for scrubbing. The conclusion of the work is that dynamic reconfiguration techniques can be effectively applied in FPGAs for trading-off resources and power consumption for availability.Open Acces
    corecore